
A New Design of Multiple Classifier System and its Application to Classification of Time Series Data

Abstract

To solve challenging pattern classification problems, machine learning researchers have extensively studied Multiple Classifier Systems (MCSs). The literature motivates combining classifiers from statistical, computational, and representational perspectives. Although a classifier combination does not always outperform the best individual classifier in the ensemble, empirical studies have demonstrated its superiority in various applications. A number of viable methods for designing MCSs have been developed, including bagging, AdaBoost, rotation forest, and random subspace, and they have been applied successfully to a variety of tasks. Currently, most research focuses on the behavior patterns of the base classifiers in the ensemble. However, a discussion from the learning point of view may provide insights into the robust design of MCSs. In this thesis, the Generalized Exhaustive Search and Aggregation (GESA) method is developed for this objective. GESA achieves robust performance by dynamically adjusting the trade-off between fitting the training data adequately and preventing overfitting. Besides its learning algorithm, GESA is also distinguished from traditional designs by its architecture and level of decision-making: it generates a collection of ensembles and dynamically selects the most appropriate ensemble for decision-making at the local level. Although GESA improves on traditional approaches, it is not very data-adaptive. A data-adaptive design of MCSs demands that the system adaptively select representations and classifiers to generate effective decisions for aggregation. Another weakness of GESA is its high computational cost, which prevents it from scaling to large ensembles. Generalized Adaptive Ensemble Generation and Aggregation (GAEGA) is an extension of GESA that overcomes these two difficulties.
GAEGA employs a greedy algorithm to adaptively select the most effective representations and classifiers while excluding noisy ones as much as possible. Consequently, GAEGA can generate fewer ensembles and significantly reduce the computational cost. Bootstrapped Adaptive Ensemble Generation and Aggregation (BAEGA) is another extension of GESA, similar to GAEGA in ensemble generation and decision aggregation. BAEGA adopts a different data manipulation strategy to improve the diversity of the generated ensembles and to use the information in the data more effectively. As a specific application, the classification of time series data is chosen for the research reported in this thesis. This type of data contains dynamic information and proves to be more complex than other types. Multiple Input Representation-Adaptive Ensemble Generation and Aggregation (MIR-AEGA) is derived from GAEGA for the classification of time series data. MIR-AEGA involves some novel representation methods that proved effective for time series data. All the proposed methods (GESA, GAEGA, MIR-AEGA, and BAEGA) are tested on simulated and benchmark data sets from popular data repositories. The experimental results confirm that the newly developed methods are effective and efficient.
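
The abstract does not spell out the greedy selection used by GAEGA, so the following is only a generic illustration of the idea of greedily keeping helpful ensemble members and leaving out noisy ones. It is a minimal sketch of plain forward selection over averaged class-1 scores, not the thesis's algorithm; the function names (`average_scores`, `accuracy`, `greedy_select`), the 0.5 threshold, and the toy validation scores are all assumptions made for this example.

```python
def average_scores(score_lists):
    """Average per-sample class-1 scores across the chosen classifiers."""
    return [sum(col) / len(col) for col in zip(*score_lists)]


def accuracy(scores, truth, threshold=0.5):
    """Threshold the averaged scores and compare against the true labels."""
    predicted = [1 if s > threshold else 0 for s in scores]
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def greedy_select(candidate_scores, truth):
    """Forward selection (hypothetical helper, not from the thesis).

    Repeatedly add the candidate whose inclusion gives the highest
    validation accuracy of the averaged ensemble; stop as soon as no
    remaining candidate strictly improves on the current accuracy.
    """
    selected, best_acc = [], 0.0
    remaining = set(candidate_scores)
    while remaining:
        scored = []
        for name in sorted(remaining):  # sorted for deterministic tie-breaking
            trial = [candidate_scores[n] for n in selected + [name]]
            scored.append((accuracy(average_scores(trial), truth), name))
        acc, name = max(scored, key=lambda item: item[0])
        if acc <= best_acc:
            break
        selected.append(name)
        best_acc = acc
        remaining.remove(name)
    return selected, best_acc


# Toy validation data (made up): class-1 scores from three candidates.
truth = [1, 0, 1, 0]
candidates = {
    "a": [0.9, 0.2, 0.4, 0.3],
    "b": [0.6, 0.6, 0.9, 0.1],
    "noise": [0.2, 0.9, 0.1, 0.8],
}
selected, acc = greedy_select(candidates, truth)
print(selected, acc)  # prints: ['a', 'b'] 1.0
```

In this toy run the two complementary candidates are kept and the anti-correlated one is never added, because including it would lower validation accuracy. Selecting on a held-out validation set rather than on the training data is a design choice of the sketch, made to limit overfitting of the selection step.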

Bibliographic record

  • Author

    Chen, Lei

  • Affiliation
  • Year 2007
  • Total pages
  • Original format PDF
  • Language en
  • CLC classification
